R is very handy for its interactive command line interface. Later we will also explore how to make R reusable with scripts, but for now we will focus on typing at the command prompt to get comfortable.
To get started, type ‘R’ at your command line.
What version of R do you have? The banner printed when R starts reports it, or you can type the following at the R prompt:
version
## _
## platform x86_64-w64-mingw32
## arch x86_64
## os mingw32
## system x86_64, mingw32
## status
## major 3
## minor 5.1
## year 2018
## month 07
## day 02
## svn rev 74947
## language R
## version.string R version 3.5.1 (2018-07-02)
## nickname Feather Spray
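When a script rather than a person needs to know the version, the same information can be read programmatically. This is a minimal sketch using base R only; the names ver_string and is_recent and the 3.5.0 threshold are illustrative assumptions, not part of the course material.

```r
# Full version string, the same text reported as version.string above
ver_string <- R.version.string
print(ver_string)

# getRversion() returns a version object that can be compared directly
current <- getRversion()
is_recent <- current >= "3.5.0"  # threshold chosen only for illustration
print(is_recent)
```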
We can now get started with the R command prompt open.
x=2
print(x) ##Print method
## [1] 2
class(x)
## [1] "numeric"
x=1:10 # Create a vector of the integers 1 to 10
class(x)
## [1] "integer"
print(x)
## [1] 1 2 3 4 5 6 7 8 9 10
print(x[1]) # First index of vector
## [1] 1
print(x[1:5])
## [1] 1 2 3 4 5
y = matrix(nrow=5, ncol=5) # create a 5x5 matrix
print(y)
## [,1] [,2] [,3] [,4] [,5]
## [1,] NA NA NA NA NA
## [2,] NA NA NA NA NA
## [3,] NA NA NA NA NA
## [4,] NA NA NA NA NA
## [5,] NA NA NA NA NA
class(y)
## [1] "matrix"
y[1,1] = 5
print(y)
## [,1] [,2] [,3] [,4] [,5]
## [1,] 5 NA NA NA NA
## [2,] NA NA NA NA NA
## [3,] NA NA NA NA NA
## [4,] NA NA NA NA NA
## [5,] NA NA NA NA NA
y[,1]= x[1:5]
print(y)
## [,1] [,2] [,3] [,4] [,5]
## [1,] 1 NA NA NA NA
## [2,] 2 NA NA NA NA
## [3,] 3 NA NA NA NA
## [4,] 4 NA NA NA NA
## [5,] 5 NA NA NA NA
class(y[,1])
## [1] "numeric"
y = cbind(1:5, # bind five identical columns into a matrix
1:5,
1:5,
1:5,
1:5)
class(y)
## [1] "matrix"
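The same 5x5 matrix can be built more compactly, because matrix() fills column by column and recycles a short vector. The sketch below uses new names (y2, first_row, third_col, block) that are not part of the session above, and also shows how row and column indexing work together.

```r
# matrix() recycles 1:5 down each of the five columns
y2 <- matrix(1:5, nrow = 5, ncol = 5)

first_row <- y2[1, ]     # every column of row 1
third_col <- y2[, 3]     # every row of column 3
block <- y2[2:3, 4:5]    # rows 2 to 3, columns 4 to 5

print(first_row)
print(third_col)
print(dim(block))
```

Leaving one side of the comma empty means "take everything" along that dimension.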
Throughout this semester we will be using small shared data files that I am storing on our course development GitHub repository. Go to https://github.com/rsh249/bioinformatics.git and download the repository. Unpack it somewhere accessible to you (e.g., your Documents or Desktop folder). Then:
setwd('/path/to/repository')
read.table(), read.csv(), read.delim()
cars = read.table('./data/mtcars.csv', header=TRUE, sep = ',') # Read a comma separated values file
head(cars)
cars = read.csv('./data/mtcars.csv')
cars = read.csv2('./data/mtcars.csv') ## Interesting behavior here: read.csv2 expects ';' separators and ',' decimals, so each row is read as a single column
cars = read.delim('./data/mtcars.csv', sep=',')
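The readers differ mainly in their default separator and decimal mark. The sketch below shows this on a small made-up file written to a temporary location; the data frame demo and its two rows are hypothetical and stand in for the course file.

```r
# Write a tiny comma separated file to a temporary path
tmp <- tempfile(fileext = ".csv")
demo <- data.frame(model = c("A", "B"), mpg = c(21.0, 22.8))
write.csv(demo, tmp, row.names = FALSE, quote = FALSE)

ok <- read.csv(tmp)    # sep = ",", dec = "."
odd <- read.csv2(tmp)  # sep = ";", dec = ",": no ';' found, so one column

print(dim(ok))
print(dim(odd))

unlink(tmp)  # remove the temporary file
```

Checking dim() or head() right after reading is a quick way to confirm the separator was guessed correctly.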
One of R’s biggest advantages is the ability to create high-quality graphics in nearly any format or style. Today we will work with the basic plotting features, but later we will take a look at the ggplot2 package, which is currently the most widely used graphics library for R.
head(cars)
## model mpg cyl disp hp drat wt qsec vs am gear carb
## 1 Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
## 2 Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
## 3 Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
## 4 Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
## 5 Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
## 6 Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
plot(cars)
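Calling plot() on a whole data frame draws a scatterplot for every pair of columns. To look at a single relationship, pass two columns instead. This is a minimal sketch that assumes R's built-in mtcars data set, which appears to hold the same measurements as the course file but stores car names as row names rather than in a model column; the axis labels and title are my own wording.

```r
data(mtcars)  # built-in data set, used here so the example is self-contained

# Scatterplot of fuel economy against weight
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)",
     ylab = "Miles per gallon",
     main = "Fuel economy vs. weight",
     pch = 19)

# Overlay a simple linear fit
fit <- lm(mpg ~ wt, data = mtcars)
abline(fit, col = "red")
```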
Repeating tasks using loops
for(i in 1:10) {
print(i)
}
## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5
## [1] 6
## [1] 7
## [1] 8
## [1] 9
## [1] 10
Catch loop output in a vector or list
li = numeric(10) # pre-allocate a numeric vector to hold the results
for(i in 1:10){
li[i]=log(i) # store the i-th result
}
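For simple arithmetic like this, many R functions are already vectorised, so the loop can often be replaced by a single call. The sketch below compares the two approaches; the names n, out, and vec are illustrative.

```r
n <- 10

# Loop version: pre-allocate, then fill one element at a time
out <- numeric(n)
for (i in seq_len(n)) {
  out[i] <- log(i)
}

# Vectorised version: log() operates on the whole vector at once
vec <- log(1:n)

print(all.equal(out, vec))
```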
The apply family of functions in R provides a concise way to repeat an operation over rows, columns, or elements, and is often faster than a for loop that grows its output as it goes.
print(y) #our matrix from earlier
## [,1] [,2] [,3] [,4] [,5]
## [1,] 1 1 1 1 1
## [2,] 2 2 2 2 2
## [3,] 3 3 3 3 3
## [4,] 4 4 4 4 4
## [5,] 5 5 5 5 5
y = as.data.frame(y)
li1 = apply(y, 1, sum) # row-wise
li2 = apply(y, 2, sum) # column-wise
li3 = lapply(y[,1], log) # returns a list
li4 = sapply(y[,1], log) # returns a vector
# replicate an operation; replicate() is a wrapper for sapply
reps = replicate(10, log(y[,1])) # named reps so it does not mask base::rep
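The row-wise and column-wise sums above can be cross-checked against the dedicated rowSums() and colSums() functions, and vapply() is a stricter relative of sapply() that states the expected return type up front. A minimal sketch, rebuilding the matrix under the new name m so that it runs on its own:

```r
m <- matrix(1:5, nrow = 5, ncol = 5)

row_totals <- apply(m, 1, sum)  # margin 1 = rows
col_totals <- apply(m, 2, sum)  # margin 2 = columns

# rowSums() gives the same row totals as apply(m, 1, sum)
same_rows <- all.equal(as.numeric(row_totals), rowSums(m))

# vapply() fails loudly if a result is not a single number
logs <- vapply(m[, 1], log, numeric(1))

print(row_totals)
print(col_totals)
print(same_rows)
```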
A subset of R packages known as the tidyverse provides loads of useful tools. Here’s how to use some of those, together with a few mapping packages, to make cool-looking maps from Google Maps data. This is a great example of the power of R’s community. I would have no idea where to start to make maps like these from scratch. But we do not have to start from nothing because functions like these exist. This is the “cookbook” approach (just follow the instructions) and it can be highly effective.
library(tidyverse)
## -- Attaching packages ----------------------------------------- tidyverse 1.2.1 --
## v ggplot2 3.0.0 v purrr 0.2.5
## v tibble 1.4.2 v dplyr 0.7.6
## v tidyr 0.8.1 v stringr 1.3.1
## v readr 1.1.1 v forcats 0.3.0
## -- Conflicts -------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(mapdata)
## Loading required package: maps
##
## Attaching package: 'maps'
## The following object is masked from 'package:purrr':
##
## map
library(maps)
library(ggmap)
library(magrittr)
##
## Attaching package: 'magrittr'
## The following object is masked from 'package:ggmap':
##
## inset
## The following object is masked from 'package:purrr':
##
## set_names
## The following object is masked from 'package:tidyr':
##
## extract
If any of these fail, try installing the missing package, for example with install.packages('tidyverse').
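To find out which packages are missing before loading them, something like the following sketch can help. It only reports what is missing; the install line is left commented out so nothing is downloaded by accident. The variable names pkgs and missing_pkgs are illustrative.

```r
# Packages loaded in this section
pkgs <- c("tidyverse", "mapdata", "maps", "ggmap", "magrittr")

# requireNamespace() returns TRUE if a package is installed, FALSE otherwise
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

print(missing_pkgs)

# Uncomment to install whatever is missing:
# install.packages(missing_pkgs)
```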
One of the best parts of these tools is the built-in access to Google Maps aerial imagery.
loc = cbind(-73.973917, 40.781799)
loc = as.data.frame(loc)
colnames(loc) = c('lon', 'lat')
bkmap <- get_map(location = loc, maptype = "satellite", source = "google", zoom =14)
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=40.781799,-73.973917&zoom=14&size=640x640&scale=2&maptype=satellite&language=en-EN&sensor=false
ggmap(bkmap) +
geom_point(data = loc,
color = "red",
size =4)
bkmap3 <- get_map(location = loc, maptype = "terrain", source = "google", zoom = 12)
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=40.781799,-73.973917&zoom=12&size=640x640&scale=2&maptype=terrain&language=en-EN&sensor=false
ggmap(bkmap3) +
geom_point(data = loc,
color = "red",
size =4)
bkmap4 <- get_map(location = loc, maptype = "toner-lite", source = "google", zoom = 10)
## maptype = "toner-lite" is only available with source = "stamen".
## resetting to source = "stamen"...
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=40.781799,-73.973917&zoom=10&size=640x640&scale=2&maptype=terrain&sensor=false
## Map from URL : http://tile.stamen.com/toner-lite/10/300/383.png
## Map from URL : http://tile.stamen.com/toner-lite/10/301/383.png
## Map from URL : http://tile.stamen.com/toner-lite/10/302/383.png
## Map from URL : http://tile.stamen.com/toner-lite/10/300/384.png
## Map from URL : http://tile.stamen.com/toner-lite/10/301/384.png
## Map from URL : http://tile.stamen.com/toner-lite/10/302/384.png
## Map from URL : http://tile.stamen.com/toner-lite/10/300/385.png
## Map from URL : http://tile.stamen.com/toner-lite/10/301/385.png
## Map from URL : http://tile.stamen.com/toner-lite/10/302/385.png
ggmap(bkmap4) +
geom_point(data = loc,
color = "red",
size =4)
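The three-step construction of loc (cbind, as.data.frame, colnames) can be collapsed into a single data.frame() call. This is a minimal sketch using the same coordinates as above; the range check is my own addition, meant to catch swapped longitude and latitude when you substitute other coordinates.

```r
# Longitude first, then latitude, matching the column names used above
loc <- data.frame(lon = -73.973917, lat = 40.781799)

# Sanity check: longitude within [-180, 180], latitude within [-90, 90]
stopifnot(loc$lon >= -180, loc$lon <= 180,
          loc$lat >= -90, loc$lat <= 90)

print(loc)
```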
Create a map like one of these with your hometown at the center, and post it to #maps.
Working in groups of 2-4, design a small data collection project that you can carry out in nicer weather. Go outside and observe something in nature that you can take quantitative measurements on. Record ~20 measurements per member. Agree on the type of observation and measurement ahead of time and bring the data to class on Wednesday for more plotting in R. Consider recording a categorical value too (e.g., measure leaf length for 2 types of plants; count flower petal number for 4 types of flowers; count the number of students standing in line at DD vs Starbucks).